Prediction of protein subcellular localization based on deep learning
WANG Yihao, DING Hongwei, LI Bo, BAO Liyong, ZHANG Yingjie
Journal of Computer Applications    2020, 40 (11): 3393-3399.   DOI: 10.11772/j.issn.1001-9081.2020040510
To address the problem that traditional machine learning algorithms still require manually designed feature representations, a protein subcellular localization algorithm based on a Stacked Denoising AutoEncoder (SDAE) deep network was proposed. Firstly, the improved Pseudo-Amino Acid Composition (PseAAC), Pseudo Position Specific Scoring Matrix (PsePSSM) and Conjoint Triad (CT) methods were each used to extract features from the protein sequence, and the feature vectors obtained by these three methods were fused into a new feature representation of the protein sequence. Secondly, the fused feature vector was input into the SDAE deep network, which automatically learned a more effective feature representation. Thirdly, a Softmax regression classifier was used to predict subcellular locations, and leave-one-out cross validation was performed on the Viral proteins and Plant proteins datasets. Finally, the results of the proposed algorithm were compared with those of existing algorithms such as mGOASVM (multi-label protein subcellular localization based on Gene Ontology and Support Vector Machine) and HybridGO-Loc (mining Hybrid features on Gene Ontology for predicting subcellular Localization of multi-location proteins). Experimental results show that the proposed algorithm achieves 98.24% accuracy on the Viral proteins dataset, which is 9.35 percentage points higher than that of mGOASVM. On the Plant proteins dataset it achieves 97.63% accuracy, which is 10.21 and 4.07 percentage points higher than that of mGOASVM and HybridGO-Loc respectively. These results indicate that the proposed algorithm can effectively improve the accuracy of protein subcellular localization prediction.
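The feature extraction and fusion steps described in the abstract can be illustrated with a minimal sketch. It computes plain Conjoint Triad frequencies, fuses feature vectors by concatenation, and applies the masking noise typically used on the input of a denoising autoencoder. The seven-class amino acid grouping, the frequency normalization, concatenation as the fusion rule, and the names `conjoint_triad`, `fuse_features` and `mask_noise` are all assumptions made for illustration; the abstract does not specify the authors' exact variants.

```python
from itertools import product

import numpy as np

# Seven amino-acid classes commonly used for Conjoint Triad features.
# Assumption: the abstract does not state which grouping the authors used.
CT_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: g for g, members in enumerate(CT_GROUPS) for aa in members}
# Map every (class, class, class) triple to an index in 0..342.
TRIADS = {t: i for i, t in enumerate(product(range(7), repeat=3))}


def conjoint_triad(sequence):
    """Return a 343-dimensional vector of class-triad frequencies.

    Residues outside the 20 standard amino acids are dropped before triads
    are formed (a simplification chosen for this sketch).
    """
    groups = [AA_TO_GROUP[aa] for aa in sequence.upper() if aa in AA_TO_GROUP]
    counts = np.zeros(len(TRIADS))
    for triad in zip(groups, groups[1:], groups[2:]):
        counts[TRIADS[triad]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def fuse_features(*vectors):
    """Fuse feature vectors by simple concatenation (assumed fusion rule)."""
    return np.concatenate([np.asarray(v, dtype=float) for v in vectors])


def mask_noise(x, drop_prob, rng):
    """Zero each entry with probability drop_prob (denoising-autoencoder input noise)."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep


# Usage with a made-up sequence; the zero vector stands in for PseAAC or
# PsePSSM features, which are not computed in this sketch.
ct = conjoint_triad("MKVLAAGIVRDE")
fused = fuse_features(ct, np.zeros(20))
noisy = mask_noise(fused, drop_prob=0.2, rng=np.random.default_rng(0))
```

In a full pipeline the corrupted vector would be fed to the first autoencoder layer, which is trained to reconstruct the clean `fused` vector; that training loop and the Softmax classifier are omitted here.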